Physical Data Warehouse Design on NoSQL Databases - OLAP Query Processing over HBase

Authors

  • Lucas C. Scabora
  • Jaqueline Joice Brito
  • Ricardo Rodrigues Ciferri
  • Cristina Dutra de Aguiar Ciferri
Abstract

Data warehousing and online analytical processing (OLAP) are core technologies in business intelligence and have therefore drawn much interest from researchers in the last decade. However, these technologies have been developed mainly for relational database systems in centralized environments. In other words, they were not designed for scalable systems such as NoSQL databases. Adapting a data warehousing environment to NoSQL databases brings several advantages, such as scalability and flexibility. This paper investigates three physical data warehouse designs that adapt the Star Schema Benchmark for use in NoSQL databases. In particular, our main investigation concerns OLAP query processing over column-oriented databases using the MapReduce framework. We analyze the impact that distributing attributes among column-families in HBase has on OLAP query performance. Our experiments showed how the processing time of OLAP queries was affected by the physical data warehouse design with respect to the number of dimensions accessed and the data volume. We conclude that using distinct distributions of attributes among column-families can improve OLAP query performance in HBase and consequently make the benchmark more suitable for OLAP over NoSQL databases.
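To make the idea of distributing attributes among column-families concrete, the following is a minimal sketch in plain Python. It does not reproduce the three designs evaluated in the paper, which the abstract does not name. The two layouts, the family name `cf`, the helper names, and the small subset of Star Schema Benchmark attributes are all assumptions chosen for illustration. The sketch only counts how many column-families a query would have to read under each layout.

```python
# Illustrative subset of Star Schema Benchmark attributes, grouped by table.
# The grouping and the layouts below are assumptions, not the paper's designs.
SSB_ATTRIBUTES = {
    "lineorder": ["lo_revenue", "lo_discount", "lo_quantity"],
    "date": ["d_year", "d_yearmonth"],
    "customer": ["c_region", "c_nation"],
    "supplier": ["s_region", "s_nation"],
    "part": ["p_category", "p_brand1"],
}


def single_family_layout(tables):
    """Hypothetical layout: every attribute lives in one column-family 'cf'."""
    return {attr: "cf" for attrs in tables.values() for attr in attrs}


def family_per_table_layout(tables):
    """Hypothetical layout: one column-family per original table."""
    return {attr: table for table, attrs in tables.items() for attr in attrs}


def families_touched(layout, query_attributes):
    """Return the sorted column-families a query over these attributes reads."""
    return sorted({layout[attr] for attr in query_attributes})


# A query that aggregates revenue by year and customer region.
query = ["lo_revenue", "d_year", "c_region"]

print(families_touched(single_family_layout(SSB_ATTRIBUTES), query))
print(families_touched(family_per_table_layout(SSB_ATTRIBUTES), query))
```

Under these assumptions, the first layout reads a single wide family regardless of how many dimensions the query accesses, while the second reads one family per table involved. That trade-off is the kind of effect the abstract says the experiments measure.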


Similar resources

Business Intelligence and NoSQL Databases

NoSQL databases are becoming more and more popular, and not only in typical Internet applications. They can store large volumes of data (so-called big data) while ensuring fast retrieval and fast appends. The main disadvantage of NoSQL databases is that they do not use the relational data model and usually do not offer a declarative query language similar to SQL. This raises the question how No...

Implementing Multidimensional Data Warehouses into NoSQL

Not only SQL (NoSQL) databases are becoming increasingly popular and have some interesting strengths, such as scalability and flexibility. In this paper, we investigate the use of NoSQL systems for implementing OLAP (On-Line Analytical Processing) systems. More precisely, we are interested in instantiating OLAP systems (from the conceptual level to the logical level) and instantiating an aggr...

Análise Experimental de Bases de Dados Relacionais e NoSQL no Processamento de Consultas sobre Data Warehouse

A data warehouse (DW) is a large, subject-oriented, non-volatile, and historical database, and an important component of Business Intelligence. OLAP (Online Analytical Processing) queries are executed on a DW and often have high response times. Data fragmentation, materialized views, and indices aim to improve performance in processing these queries. Additionally, NoSQL (Not only SQL) d...

Dynamic Data Warehouse Design as a Refinement in ASM-based Approach

On-line analytical processing (OLAP) systems deal with analytical tasks in businesses. As these tasks do not depend on the latest updates by transactions, it is assumed that the data used in OLAP systems are kept in a data warehouse, which separates the input from operational databases from the outputs to OLAP. Typical OLAP queries are data intensive, and thus time consuming. In order to speed ...

Distributed RDF Triple Store Using HBase and Hive

The growth of web data has presented new challenges regarding the ability to effectively query RDF data. Traditional relational database systems efficiently scale and query distributed data. With the development of Hadoop and its implementation of the MapReduce framework, along with HBase, a NoSQL data store, the semantics of processing and querying data have changed. Given the existing structure of ...

Publication date: 2016